In many cases, the meaning of a word or phrase is defined by, or is at least evident from, the surrounding words or phrases. Thus, for a given word or phrase, another word or phrase that occurs in a similar context will tend to have the same or a similar meaning. Such pairs of words or phrases can be useful in a wide variety of language processing applications including, but not limited to, paraphrase generation and language translation.
The World Wide Web (a.k.a. “the web”) consists of an explicitly interlinked network of documents. Implicit in the web, however, is a more subtle kind of informational network, namely an implicitly linked network of overlapping pieces of linguistic expression. Many pages, for instance, contain the string “walked down by the river”; however, few, if any, of these pages are linked to one another, and nothing explicitly reflects the fact that all of these pages share an identical chunk of linguistic content. A broad range of language processing applications could benefit from systems or methods for effectively analyzing these types of overlapping pieces of linguistic expression so as to identify pairs of words or phrases that have the same or similar meaning.
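By way of illustration only, the notion that words occurring in the same context tend to share meaning can be sketched in a few lines of Python. The function names `context_pairs` and `candidate_pairs`, the `window` parameter, and the sample sentences are hypothetical and chosen for this example; the sketch is a minimal one that assumes whitespace tokenization and exact context matching, and it is not a description of any particular system or method.

```python
from collections import defaultdict


def context_pairs(sentences, window=1):
    """Map each (left, right) context to the set of words observed inside it.

    Assumption: tokens are produced by lowercasing and splitting on whitespace.
    """
    contexts = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Only positions with a full left and right context are considered.
        for i in range(window, len(tokens) - window):
            left = tuple(tokens[i - window:i])
            right = tuple(tokens[i + 1:i + 1 + window])
            contexts[(left, right)].add(tokens[i])
    return contexts


def candidate_pairs(sentences, window=1):
    """Return pairs of distinct words that share at least one identical context."""
    pairs = set()
    for words in context_pairs(sentences, window).values():
        ordered = sorted(words)
        for idx, first in enumerate(ordered):
            for second in ordered[idx + 1:]:
                pairs.add((first, second))
    return pairs


# Hypothetical sample text standing in for overlapping content on different pages.
sentences = [
    "she walked down by the river",
    "she strolled down by the river",
    "he walked down the street",
]
print(candidate_pairs(sentences))
```

In this toy input, “walked” and “strolled” both occur between “she” and “down”, so they are reported as a candidate pair. Exact matching of a one-word window is a deliberately crude notion of “similar context”; it is used here only to make the underlying intuition concrete.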
The discussion above is merely provided for general background information and is not intended for use as an aid in determining the scope of the claimed subject matter.